Data visualisation — making a plot with ggplot2
2025-09-24
Visualisation involves representing data by lines, shapes, colours, etc.
Map data to visual channels — some channels more effective than others
ggplot provides a set of tools to
A ggplot comprises several main elements
Load some packages that we need for plotting and working with data
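A minimal sketch of this step. The exact package list is an assumption: ggplot2 for plotting, dplyr for working with data, and palmerpenguins as the source of the penguins data frame used in the rest of these slides.

```r
# Assumed package set; swap in whatever your session actually needs
library(ggplot2)        # plotting
library(dplyr)          # working with data
library(palmerpenguins) # assumed source of the `penguins` data frame

# Quick look at the data we will plot
dim(penguins)
```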
ggplot()
The main function is ggplot()
data, the data frame containing the data
mapping, the mapping of data to aesthetics with aes()
Add layers to the plot via +
Geoms are the main layer-types we add to influence the plot
Geoms by default inherit the data and aesthetics from the ggplot() call
Two main ways in which data tend to be recorded
In long-format:
In wide-format
ggplot requires data in long form
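To make the two layouts concrete, here is a small sketch that reshapes a wide table into long form. The data frame and its column names (site, count_2023, count_2024) are hypothetical, and tidyr's pivot_longer() is one common way to do the reshaping, not necessarily the one used in the original slides.

```r
library(tidyr)

# Hypothetical wide-format data: one row per site, one column per year
wide <- data.frame(
  site = c("A", "B"),
  count_2023 = c(10, 7),
  count_2024 = c(12, 9)
)

# Long format: one row per site-year observation
long <- pivot_longer(
  wide,
  cols = -site,             # reshape every column except site
  names_to = "year",        # old column names become values of year
  names_prefix = "count_",  # strip the prefix from those names
  values_to = "count"       # cell values go into count
)

long
```

In the long version each variable (site, year, count) has its own column, which is what lets each one be mapped to an aesthetic.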
# A tibble: 344 × 8
   species island    bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
   <fct>   <fct>              <dbl>         <dbl>             <int>       <int>
 1 Adelie  Torgersen           39.1          18.7               181        3750
 2 Adelie  Torgersen           39.5          17.4               186        3800
 3 Adelie  Torgersen           40.3          18                 195        3250
 4 Adelie  Torgersen           NA            NA                  NA          NA
 5 Adelie  Torgersen           36.7          19.3               193        3450
 6 Adelie  Torgersen           39.3          20.6               190        3650
 7 Adelie  Torgersen           38.9          17.8               181        3625
 8 Adelie  Torgersen           39.2          19.6               195        4675
 9 Adelie  Torgersen           34.1          18.1               193        3475
10 Adelie  Torgersen           42            20.2               190        4250
# ℹ 334 more rows
# ℹ 2 more variables: sex <fct>, year <int>
Say we want to plot flipper length (flipper_length_mm) against bill length (bill_length_mm)
We tell ggplot() where to look for variables, but haven't specified any mappings yet
Assigned the output of the ggplot() call to the object p (could call p anything)
In RStudio, Alt + - (Windows/Linux) or Option + - (macOS) types the assignment operator <-
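A sketch of the call described here, assuming the penguins data frame comes from the palmerpenguins package:

```r
library(ggplot2)
library(palmerpenguins) # assumed source of the `penguins` data frame

# Data only: ggplot() knows where to find variables, but nothing is mapped yet
p <- ggplot(data = penguins)
```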
We specify mappings between variables and aesthetics via the mapping argument
Use the aes() function to specify the mappings
This sets up a mapping between our two variables and the x and y aesthetics
The x and y aesthetics are the \(x\) and \(y\) coordinates of the plot
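The mapping step might look like the sketch below. The choice of flipper length on x and bill length on y follows the full example later in these slides; palmerpenguins is assumed as the data source.

```r
library(ggplot2)
library(palmerpenguins) # assumed source of the `penguins` data frame

# Map two variables to the x and y aesthetics; still no layers
p <- ggplot(
  data = penguins,
  mapping = aes(x = flipper_length_mm, y = bill_length_mm)
)
```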
We can draw the plot by print()ing the object p
What do you think you'll get if you print p?
Only the scale for the x and y aesthetics is drawn
Need to tell ggplot() how we want the data drawn
Need to choose a geometric object or geom
geoms are functions with names geom_<type>()
A geom adds a layer to an existing plot
For a scatterplot, we represent the \(x\), \(y\) pairs via points geom_point()
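A sketch of adding the point layer, rebuilding p so the block runs on its own (palmerpenguins assumed as the data source):

```r
library(ggplot2)
library(palmerpenguins) # assumed source of the `penguins` data frame

p <- ggplot(penguins, aes(x = flipper_length_mm, y = bill_length_mm))

# Add a layer of points: one point per penguin
p_points <- p + geom_point()
p_points
```

Rows with missing measurements are dropped with a warning when the plot is drawn.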
geom_smooth() adds a smoother
Here we see the effect of a statistical summary associated with a geom
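A sketch of stacking a smoother on top of the points. With no method given and fewer than 1,000 observations, geom_smooth() falls back to a loess smoother and reports this in a message. The data source (palmerpenguins) is an assumption.

```r
library(ggplot2)
library(palmerpenguins) # assumed source of the `penguins` data frame

p <- ggplot(penguins, aes(x = flipper_length_mm, y = bill_length_mm))

# Points plus a smoother; the smooth line is computed from the data by the geom's stat
p_smooth <- p +
  geom_point() +
  geom_smooth()
p_smooth
```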
Didn't need to tell each geom what data or mappings to use
Information is inherited from the main ggplot() object
Can override this by passing data or mapping to an individual geom
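One way overriding could look, as a sketch: the point layer gets an extra colour mapping of its own, and the smoother is given its own data. The Adelie-only subset is purely illustrative, and palmerpenguins is assumed as the data source.

```r
library(ggplot2)
library(palmerpenguins) # assumed source of the `penguins` data frame

p <- ggplot(penguins, aes(x = flipper_length_mm, y = bill_length_mm))

# Illustrative subset used only by the smoother layer
adelie <- subset(penguins, species == "Adelie")

p_override <- p +
  geom_point(aes(colour = species)) +        # extra mapping for this layer only
  geom_smooth(data = adelie, method = "lm")  # this layer uses its own data
p_override
```

The x and y mappings are still inherited by both layers; only what is explicitly supplied to a geom replaces the inherited value.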
Mappings are inside aes(), settings go outside aes()
alpha controls transparency, size controls how big things are
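A sketch contrasting the two. The colour "steelblue" and the specific alpha and size values are arbitrary choices for illustration; palmerpenguins is assumed as the data source.

```r
library(ggplot2)
library(palmerpenguins) # assumed source of the `penguins` data frame

p <- ggplot(penguins, aes(x = flipper_length_mm, y = bill_length_mm))

# Mapping: colour varies with a variable, so it goes inside aes()
p_mapping <- p + geom_point(aes(colour = species))

# Setting: fixed values for every point, so they go outside aes()
p_setting <- p + geom_point(colour = "steelblue", alpha = 0.3, size = 2)

# Common mistake: aes(colour = "steelblue") treats the string as data,
# so the points get a default scale colour and a legend, not steel blue
```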
labs(): setting plot labels
ggplot(
  penguins,
  aes(
    x = flipper_length_mm,
    y = bill_length_mm
  )
) +
  geom_point(alpha = 0.3) +
  geom_smooth(
    method = "lm",
    colour = "orange",
    se = FALSE,
    linewidth = 2 # line thickness; use size instead on ggplot2 older than 3.4.0
  ) +
  labs(
    x = "Flipper length (mm)",
    y = "Bill length (mm)",
    title = "How big are penguins anyway?",
    subtitle = "Data points are individual penguins",
    caption = "Source: palmerpenguins"
  )
You can save time and effort by reusing plot elements
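A sketch of what reusing plot elements can look like: store a base plot in an object, then add to it instead of retyping the whole call. The object names (p, p_lm, p_labelled) are hypothetical, and palmerpenguins is assumed as the data source.

```r
library(ggplot2)
library(palmerpenguins) # assumed source of the `penguins` data frame

# Base plot, stored once
p <- ggplot(penguins, aes(x = flipper_length_mm, y = bill_length_mm)) +
  geom_point(alpha = 0.3)

# Reuse it: add a linear fit without repeating the data, mappings, or points
p_lm <- p + geom_smooth(method = "lm", se = FALSE)

# Reuse it again: add labels on top
p_labelled <- p_lm +
  labs(
    x = "Flipper length (mm)",
    y = "Bill length (mm)",
    title = "How big are penguins anyway?"
  )
p_labelled
```

Adding to a stored plot returns a new object, so p itself is left unchanged.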
ggsave(): Saving your work
Plots can be rendered to disk in a range of formats: PNG, PDF, …
Type of file depends on the extension given in filename
ggsave() saves the last ggplot object plotted
ggsave() saves a specific ggplot object if given one
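A sketch of both forms. The file names are hypothetical and are written to a temporary directory so the example leaves nothing behind in your project; palmerpenguins is assumed as the data source.

```r
library(ggplot2)
library(palmerpenguins) # assumed source of the `penguins` data frame

p <- ggplot(penguins, aes(x = flipper_length_mm, y = bill_length_mm)) +
  geom_point()

# Hypothetical output paths in a temporary directory
png_file <- file.path(tempdir(), "penguins.png")
pdf_file <- file.path(tempdir(), "penguins.pdf")

# Form 1: no plot given, so ggsave() uses the last plot (see last_plot())
print(p)
ggsave(png_file)

# Form 2: save a specific plot object; the .pdf extension selects the format
ggsave(pdf_file, plot = p)
```

Passing plot explicitly is the safer habit in scripts, where the "last plot" may not be the one you meant.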
ggsave(): Specifying size
ggsave() measures size in inches by default and takes the size from the current graphics device if not specified
Can set width and height to numeric values and select the units via units
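A sketch of setting the size explicitly. The dimensions, resolution, and file name are arbitrary illustrative choices; palmerpenguins is assumed as the data source.

```r
library(ggplot2)
library(palmerpenguins) # assumed source of the `penguins` data frame

p <- ggplot(penguins, aes(x = flipper_length_mm, y = bill_length_mm)) +
  geom_point()

# Hypothetical output path in a temporary directory
sized_file <- file.path(tempdir(), "penguins-16x10cm.png")

# Explicit size in centimetres instead of the default inches
ggsave(
  sized_file,
  plot = p,
  width = 16,
  height = 10,
  units = "cm",
  dpi = 300 # resolution for raster formats such as PNG
)
```

Fixing width and height makes the saved figure reproducible, since it no longer depends on the size of whatever plot window happens to be open.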